Coevolutionary Nash in poker games

Authors

  • Frans A. Oliehoek
  • Nikos A. Vlassis
  • Edwin D. de Jong
Abstract

We address the problem of learning good policies in poker games. The classical game-theoretic approach to this problem specifies a Nash equilibrium solution, i.e., a pair of secure mixed (randomized) policies. We describe a new approach for calculating such secure policies based on coevolution. Here, populations of pure policies for both players are simultaneously evolved by repeated comparisons against each other, and secure mixed policies are computed from both populations by linear programming. The search heuristic for adding new candidate pure policies involves computing, for each mixed policy, a best-response pure policy (by solving a POMDP) that yields the worst-case payoff of that mixed policy. We provide experimental results suggesting that a Nash equilibrium policy can be approximated in relatively few iterations, thereby producing mixed policies with relatively small support. We conclude that this is a promising direction of research and provide directions for future work.
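The best-response step described in the abstract can be illustrated on a small matrix game. The sketch below is not the authors' implementation: it assumes the game is given as an explicit zero-sum payoff matrix (rock-paper-scissors, chosen here purely for illustration) instead of a poker game solved as a POMDP, and the names `PAYOFF`, `worst_case_for_row`, `worst_case_for_col`, and `nash_gap` are hypothetical. It shows how a pure best response gives the worst-case payoff of a mixed policy, and how the difference between the two players' worst-case payoffs measures the distance from a Nash equilibrium.

```python
from fractions import Fraction

# Hypothetical payoff matrix for the row player in a zero-sum game
# (rock-paper-scissors). Rows and columns index pure policies.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def worst_case_for_row(payoff, row_mix):
    """Column player's pure best response to row_mix.

    Returns (column index, row player's expected payoff), i.e. the
    worst-case payoff of the mixed policy row_mix.
    """
    n_cols = len(payoff[0])
    expected = [
        sum(p * payoff[i][j] for i, p in enumerate(row_mix))
        for j in range(n_cols)
    ]
    best_col = min(range(n_cols), key=lambda j: expected[j])
    return best_col, expected[best_col]


def worst_case_for_col(payoff, col_mix):
    """Row player's pure best response to col_mix.

    Returns (row index, row player's expected payoff), i.e. the most
    the row player can gain against the mixed policy col_mix.
    """
    n_rows = len(payoff)
    expected = [
        sum(q * payoff[i][j] for j, q in enumerate(col_mix))
        for i in range(n_rows)
    ]
    best_row = max(range(n_rows), key=lambda i: expected[i])
    return best_row, expected[best_row]


def nash_gap(payoff, row_mix, col_mix):
    """Difference between the two worst-case payoffs; zero at a Nash equilibrium."""
    _, lower = worst_case_for_row(payoff, row_mix)
    _, upper = worst_case_for_col(payoff, col_mix)
    return upper - lower


uniform = [Fraction(1, 3)] * 3
rock_heavy = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]

print(worst_case_for_row(PAYOFF, rock_heavy))  # best response is column 1 (paper)
print(nash_gap(PAYOFF, uniform, uniform))      # uniform play is secure: gap of 0
print(nash_gap(PAYOFF, rock_heavy, uniform))   # positive gap: rock_heavy is exploitable
```

In a coevolutionary loop of the kind the abstract outlines, the best-response indices returned here would be the candidate pure policies added to each population, and the loop would stop once the gap is small enough.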

Related articles

Game theory and AI: a unified approach to poker games

This thesis focuses on decision making in partially observable card games and, in particular, poker games. An attempt is made to outline both the game-theoretic and the agent-centric approach to such games, analyzing differences and similarities as well as strong and weak points, and finally proposing a view that makes a tradeoff between them. The game-theoretic approach for this type of games w...

Using counterfactual regret minimization to create competitive multiplayer poker agents

Games are used to evaluate and advance Multiagent and Artificial Intelligence techniques. Most of these games are deterministic with perfect information (e.g. Chess and Checkers). A deterministic game has no chance element and in a perfect information game, all information is visible to all players. However, many real-world scenarios with competing agents are stochastic (non-deterministic) with...

Best-response play in partially observable card games

We address the problem of how to play optimally against a fixed opponent in a two-player card game with partial information like poker. A game-theoretic approach to this problem would specify a pair of stochastic policies that are best responses to each other, i.e., a Nash equilibrium. Although such a Nash-optimal policy guarantees a lower bound to the attainable payoff against any opponent, it ...

Deep Reinforcement Learning from Self-Play in Imperfect-Information Games

Many real-world applications can be described as large-scale games of imperfect information. To deal with these challenging domains, prior work has focused on computing Nash equilibria in a handcrafted abstraction of the domain. In this paper we introduce the first scalable end-to-end approach to learning approximate Nash equilibria without any prior knowledge. Our method combines fictitious sel...

A simple and numerically stable primal-dual algorithm for computing Nash-equilibria in sequential games with incomplete information

We present a simple primal-dual algorithm for computing approximate Nash equilibria in two-person zero-sum sequential games with incomplete information and perfect recall (like Texas Hold’em poker). Our algorithm performs only basic iterations (i.e., matvec multiplications, clipping, etc., with no calls to external first-order oracles, no matrix inversions, etc.) and is applicable to a broad class...

Publication date: 2005